Policy Continuation with Hindsight Inverse Dynamics

Sun, Hao, Li, Zhizhong, Liu, Xiaotong, Zhou, Bolei, Lin, Dahua

Neural Information Processing Systems

Solving goal-oriented tasks is an important but challenging problem in reinforcement learning (RL). For such tasks, the rewards are often sparse, making it difficult to learn a policy effectively. To tackle this difficulty, we propose a new approach called Policy Continuation with Hindsight Inverse Dynamics (PCHID). This approach learns from Hindsight Inverse Dynamics based on Hindsight Experience Replay, enabling the learning process in a self-imitated manner so that it can be trained with supervised learning. This work also extends it to multi-step settings with Policy Continuation. The proposed method is general: it can work in isolation or be combined with other on-policy and off-policy algorithms. On two multi-goal tasks, GridWorld and FetchReach, PCHID significantly improves both sample efficiency and final performance.
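As a concrete illustration, here is a minimal sketch of the one-step Hindsight Inverse Dynamics idea, assuming a goal-conditioned policy network with discrete actions (as in GridWorld); the names below (`collect_hid_pairs`, `hid_update`, the state-to-goal mapping `m`) are illustrative assumptions, not the paper's reference code. Any transition can be relabeled with the goal its next state actually achieves, which turns ordinary rollouts into correct supervised examples.

```python
# Minimal sketch of one-step Hindsight Inverse Dynamics (HID).
# All names here are illustrative assumptions, not the paper's code.
import torch.nn.functional as F

def collect_hid_pairs(trajectory, m):
    """Relabel consecutive state pairs with hindsight goals.

    trajectory: list of (state, action, next_state) tuples from any rollout.
    m: maps a state to the goal it achieves (e.g. the identity in GridWorld).
    Returns (state, goal, action) triples: in hindsight, `action` is exactly
    the right move for reaching m(next_state) from `state` in one step.
    """
    return [(s, m(s_next), a) for (s, a, s_next) in trajectory]

def hid_update(policy, optimizer, states, goals, actions):
    """One supervised step: imitate the hindsight-relabeled actions."""
    logits = policy(states, goals)           # goal-conditioned action logits
    loss = F.cross_entropy(logits, actions)  # discrete actions, as in GridWorld
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

For a continuous-action task such as FetchReach, the cross-entropy loss would be swapped for a regression loss (e.g. mean squared error) on the action vector.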



Author Response: Policy Continuation with Hindsight Inverse Dynamics

Neural Information Processing Systems

We thank all reviewers for their insightful comments. Please see the responses below.

To Reviewer 1, Q1: The connection between the policy and Hindsight Inverse Dynamics (HID). Instead of mapping (s

Q2: Why is it important to relabel data to learn HID? Multi-step HIDs help such extrapolations in non-trivial cases, and Fig. 1(b) below shows similar results. For most goal-oriented tasks, the learning objective is to find a policy that reaches the goal as soon as possible.
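To make the multi-step point concrete, the following hedged sketch extends the hindsight relabeling to k steps. The `test_k_reachable` check is a hypothetical stand-in for the paper's TEST procedure: it should answer whether a goal is reachable from a state within a given step budget, for instance by querying the already-trained shorter-horizon policy or, in a small environment like GridWorld, by exact search.

```python
# Hedged sketch of k-step HID extraction. `test_k_reachable` is a
# hypothetical stand-in for the paper's TEST procedure.

def collect_multistep_hid(trajectory, m, k, test_k_reachable):
    """Extract k-step hindsight triples (s_t, m(s_{t+k}), a_t)."""
    if len(trajectory) < k:
        return []
    states = [s for (s, _, _) in trajectory] + [trajectory[-1][2]]
    actions = [a for (_, a, _) in trajectory]
    triples = []
    for t in range(len(actions) - k + 1):
        goal = m(states[t + k])
        # Keep the pair only if the goal truly needs k steps: were it
        # reachable within k-1 steps, imitating a_t could teach a detour
        # rather than efficient goal-reaching behavior.
        if not test_k_reachable(states[t], goal, k - 1):
            triples.append((states[t], goal, actions[t]))
    return triples
```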


Reviews: Policy Continuation with Hindsight Inverse Dynamics

Neural Information Processing Systems

The paper presents a new approach to inverse dynamics learning, extended to goal-conditioned, multi-step inverse dynamics. The approach is combined with standard RL algorithms to solve multi-goal tasks such as the OpenAI Fetch environment. All reviewers liked the ideas presented in the paper and appreciated the contributions. The experiments were well executed and the results are convincing. I am also convinced that the paper offers interesting contributions to multi-goal RL and recommend it for a spotlight presentation.



Policy Continuation with Hindsight Inverse Dynamics

Sun, Hao, Li, Zhizhong, Liu, Xiaotong, Lin, Dahua, Zhou, Bolei

arXiv.org Machine Learning

Solving goal-oriented tasks is an important but challenging problem in reinforcement learning (RL). For such tasks, the rewards are often sparse, making it difficult to learn a policy effectively. To tackle this difficulty, we propose a new approach called Policy Continuation with Hindsight Inverse Dynamics (PCHID). This approach learns from Hindsight Inverse Dynamics based on Hindsight Experience Replay, enabling the learning process in a self-imitated manner so that it can be trained with supervised learning. This work also extends it to multi-step settings with Policy Continuation. The proposed method is general: it can work in isolation or be combined with other on-policy and off-policy algorithms. On two multi-goal tasks, GridWorld and FetchReach, PCHID significantly improves both sample efficiency and final performance.
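Since the abstract notes that PCHID can be combined with other on-policy and off-policy algorithms, here is one plausible combination scheme, sketched under assumptions: the shared-network, alternating-update design and the `rl_loss_fn` hook are choices made here for illustration, not necessarily the paper's exact recipe.

```python
# One plausible PCHID + base-RL combination: share the policy network and
# alternate the base algorithm's update with PCHID's supervised HID update.
# `rl_loss_fn` is a hypothetical hook returning the base learner's loss
# (e.g. a policy-gradient or TD loss) as a differentiable tensor.
import torch.nn.functional as F

def joint_update(policy, optimizer, rl_loss_fn, rl_batch, hid_batch):
    # Step 1: the base algorithm's own update on its rollout/replay batch.
    loss_rl = rl_loss_fn(policy, rl_batch)
    optimizer.zero_grad()
    loss_rl.backward()
    optimizer.step()

    # Step 2: PCHID's supervised update on hindsight inverse dynamics data
    # (discrete actions assumed here, matching the earlier sketch).
    states, goals, actions = hid_batch
    loss_hid = F.cross_entropy(policy(states, goals), actions)
    optimizer.zero_grad()
    loss_hid.backward()
    optimizer.step()
    return loss_rl.item(), loss_hid.item()
```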